Make snapshot deletion faster #2

Closed
wants to merge 2 commits into from

Owner

@AmiStrn AmiStrn commented Feb 18, 2021

Deleting a snapshot takes longer than expected. A major reason is that the
(often many) stale indices are deleted one after another.
This commit makes the deletion concurrent by running it on the `SNAPSHOT` threadpool.
To avoid flooding the threadpool queue with delete tasks, it follows the same
approach as `executeOneFileSnapshot()`: only a bounded number of delete tasks is
queued at any time, so other tasks that share this threadpool are not delayed
for long.

Fixes elastic#61513 in the Elasticsearch project.
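
For readers unfamiliar with the pattern, here is a minimal, standalone sketch of the idea, not the code in this PR. It starts `min(maxThreads, staleIndices.size())` worker tasks; each worker deletes one index and then re-submits itself, so the pool queue never holds more than that many delete tasks. The class `BoundedDeleteSketch`, the methods `deleteAll` and `deleteNext`, and the `Consumer<String>` callback are hypothetical names for illustration, and a plain `ExecutorService` stands in for the `SNAPSHOT` threadpool. It assumes `maxThreads` is at least 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

public class BoundedDeleteSketch {

    /** Deletes every stale index with at most maxThreads tasks queued at once; blocks until done. */
    static void deleteAll(List<String> staleIndices, int maxThreads,
                          ExecutorService executor, Consumer<String> deleteOne)
            throws InterruptedException {
        if (staleIndices.isEmpty()) {
            return;
        }
        BlockingQueue<String> remaining = new LinkedBlockingQueue<>(staleIndices);
        CountDownLatch done = new CountDownLatch(staleIndices.size());
        // Never start more workers than there are deletions to perform.
        int workers = Math.min(maxThreads, staleIndices.size());
        for (int i = 0; i < workers; i++) {
            executor.execute(() -> deleteNext(remaining, done, executor, deleteOne));
        }
        done.await();
    }

    /** Deletes one index, then re-submits itself so other pool users can interleave. */
    private static void deleteNext(BlockingQueue<String> remaining, CountDownLatch done,
                                   ExecutorService executor, Consumer<String> deleteOne) {
        String index = remaining.poll();
        if (index == null) {
            return; // nothing left to delete, this worker ends here
        }
        try {
            deleteOne.accept(index);
        } catch (RuntimeException e) {
            // A failed cleanup is reported but does not stop the remaining deletions.
            System.err.println("failed to clean up " + index + ": " + e);
        } finally {
            // Submit the follow-up before counting down, so that every task has been
            // submitted by the time deleteAll() returns and the caller may shut down the pool.
            executor.execute(() -> deleteNext(remaining, done, executor, deleteOne));
            done.countDown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> stale = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            stale.add("index-" + i);
        }
        Set<String> deleted = ConcurrentHashMap.newKeySet();
        ExecutorService snapshotPool = Executors.newFixedThreadPool(5); // stand-in pool
        deleteAll(stale, 5, snapshotPool, deleted::add);
        snapshotPool.shutdown();
        System.out.println("deleted " + deleted.size() + " of " + stale.size());
    }
}
```

Re-submitting after each deletion, instead of looping inside one long-running task, is what lets unrelated tasks on the same pool get a turn between deletions.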

"[{}] index {} is no longer part of any snapshots in the repository, " +
"but failed to clean up their index folders", metadata.name(), indexSnId), e);
}
final int workers = Math.min(threadPool.info(ThreadPool.Names.SNAPSHOT).getMax(), staleIndicesToDelete.size());

Is it possible for staleIndicesToDelete to exceed the max threadPool size?

Owner Author

Yes, this is very much possible: the `SNAPSHOT` threadpool is capped at 5 threads, while the number of stale indices can easily be in the dozens, if not more. The cap is set in the `ThreadPool` class constructor via `org.elasticsearch.threadpool.ThreadPool#halfAllocatedProcessorsMaxFive`.
We take the min of the two to cover the opposite case, where there are fewer deletions than available threads, so that no idle workers are started.
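
As a rough illustration of the arithmetic, the sketch below computes the worker count for a few cases. Only the cap of five comes from this thread; the exact rounding and the lower bound of 1 inside `halfAllocatedProcessorsMaxFive` are my assumption based on the method's name, and `SnapshotWorkerCount` and `workers` are hypothetical names.

```java
public class SnapshotWorkerCount {

    // Assumed sizing rule: half the allocated processors (rounded up), between 1 and 5.
    static int halfAllocatedProcessorsMaxFive(int allocatedProcessors) {
        return Math.min(5, Math.max(1, (allocatedProcessors + 1) / 2));
    }

    // Same shape as the line under review: min(pool max, number of stale indices).
    static int workers(int allocatedProcessors, int staleIndexCount) {
        return Math.min(halfAllocatedProcessorsMaxFive(allocatedProcessors), staleIndexCount);
    }

    public static void main(String[] args) {
        System.out.println(workers(16, 40)); // 5: many stale indices, pool cap wins
        System.out.println(workers(16, 3));  // 3: fewer deletions than threads
        System.out.println(workers(4, 40));  // 2: small machine, smaller pool
    }
}
```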

Great! Thank you for the explanation!
